Methods and Cost Models for XPath Query Processing in Main Memory Databases
Abstract
Recent work on XPath evaluation has produced efficient relational index structures for maintaining and querying XML through a DBMS. Building on one such relational encoding, the XPath Accelerator, this thesis takes a closer look at its use in query processing. The study focuses on basic XPath operations, such as axis steps and simple node tests. Appropriate database operations for their evaluation are introduced in the context of the main memory DBMS Monet. Where the existing database operators fail to exploit the tree properties of XML data, new algorithms have been developed specifically for the evaluation of XPath axes. As an important step towards cost analysis for the proposed XPath operations, result size estimation is discussed in terms of the trade-off between accuracy and expense. Different methods show how statistical data as well as sampling techniques can be used to estimate the result sizes of simple axis steps. The cost functions mainly consider the time that the XPath operations spend on data access. Even in main memory databases, CPU processing is usually stalled by outstanding memory fetches. Therefore, our cost functions explicitly analyze the cache usage of the operations, adopting a hierarchical memory access model. Detailed tests demonstrate the accuracy and performance of the proposed result size and cost estimation techniques.
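To make the encoding idea concrete, the following is a minimal Python sketch, not the thesis's Monet implementation. It assumes the XPath Accelerator's usual preorder/postorder rank encoding, under which the descendants of a context node are exactly the nodes with a larger preorder rank and a smaller postorder rank. The names `Node`, `encode`, `descendants`, and `estimate_descendant_count` are hypothetical, and the uniform-sampling estimator is only one simple way to trade accuracy for expense, not necessarily one of the methods the thesis evaluates.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical in-memory XML element: a tag and its child elements."""
    tag: str
    children: list = field(default_factory=list)


def encode(root):
    """Return one (pre, post, tag) row per node, sorted by preorder rank."""
    rows = []
    counters = {"pre": 0, "post": 0}

    def visit(node):
        pre = counters["pre"]
        counters["pre"] += 1
        index = len(rows)
        rows.append(None)  # reserve the slot so rows stay in document order
        for child in node.children:
            visit(child)
        # The postorder rank is assigned only after all children are done.
        rows[index] = (pre, counters["post"], node.tag)
        counters["post"] += 1

    visit(root)
    return rows


def descendants(rows, context):
    """Descendant axis as a region query: pre greater, post smaller."""
    pre, post, _ = context
    return [r for r in rows if r[0] > pre and r[1] < post]


def estimate_descendant_count(rows, context, sample_size, rng):
    """Estimate the result size from a uniform sample, scaled to the table."""
    sample = rng.sample(rows, sample_size)
    hits = len(descendants(sample, context))
    return hits * len(rows) / sample_size


# Example document: <a><b><c/><d/></b><e><f/></e></a>
tree = Node("a", [Node("b", [Node("c"), Node("d")]), Node("e", [Node("f")])])
rows = encode(tree)
context_b = rows[1]  # the row for element b
exact = descendants(rows, context_b)
estimate = estimate_descendant_count(rows, context_b, 3, random.Random(0))
```

The full scan in `descendants` ignores the tree properties of the encoding. A tree-aware operator could, for instance, stop scanning at the first row whose postorder rank exceeds that of the context node, since rows are sorted by preorder rank.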
Related resources
Relational Databases Query Optimization using Hybrid Evolutionary Algorithm
Optimizing database queries is one of the hard research problems. Exhaustive search techniques such as dynamic programming are suitable for queries with a few relations, but as the number of relations in a query increases, they require too much memory and processing to be practical, so random and evolutionary methods have to be used instead. The use of evolutionary methods, beca...
Selecting the Most Suitable Query Language for Using Outer Joins to Extract Data in Datalog Mode in the Deductive Database System DES
Deductive database systems are designed around a logical data model. Unlike a relational database management system (RDBMS), in which data is stored in tables, a deductive database system stores data as facts. The Datalog Educational System (DES) is a deductive database system in which Datalog mode is the default mode. It can extract data using outer joins with three query la...
Pushing XML Main Memory Databases to their Limits
The wide distribution of XML documents and the standardization of the query languages XPath and XQuery have led to a wide variety of XML database implementations. Yet the efficient processing of really large XML documents is still supported by just a few products, such as MonetDB/XQuery as an open-source solution [1] or X-Hive as a commercial product [2]. Following the main memory and relation...
FluXQuery: An Optimizing XQuery Processor for Streaming XML Data
XML has established itself as the ubiquitous format for data exchange on the Internet. An imminent development is that of streams of XML data being exchanged and queried. Data management scenarios where XQuery [11] is evaluated on XML streams are becoming increasingly important and realistic, e.g. in e-commerce settings. Naturally, query engines employed for stream processing are main-memory-ba...
A Dynamic Load-balancing Scheme for XPath Queries Parallelization in Shared Memory Multi-core Systems
Due to the rapidly growing popularity of multi-core processor systems, the parallelization of XPath queries in shared memory multi-core systems has received increasing attention. Existing work developed some parallelization methods based on cost estimation and static mapping, which can be seen as a logical optimization of the parallel query plan. However, static mapping may result in load imbalance that hurts th...
Publication date: 2003